Abstract

Foods naturally contain a number of contaminants that may have different and long term 
toxic effects. This paper introduces a novel approach for the assessment of such chronic food 
risk that integrates the pharmacokinetic properties of a given contaminant. The estimation of
such a Kinetic Dietary Exposure Model (KDEM) should be based on long term consumption 
data which, for the moment, can only be provided by Household Budget Surveys such as the 
SECODIP panel in France. A semi parametric model is proposed to decompose a series of 
household quantities into individual quantities which are then used as inputs of the KDEM. 
As an illustration, the risk assessment related to the presence of methylmercury in seafoods is 
revisited using this novel approach.

Introduction



The quantitative assessment of dietary exposure to certain contaminants is of high priority to the
Food and Agricultural Organization and the World Health Organization (FAO/WHO). For example,
excessive exposure to methylmercury, a contaminant mainly found in fish and other seafood
(mollusks and shellfish), may have neurotoxic effects such as neuronal loss, ataxia, visual
disturbance, impaired hearing, and paralysis (WHO, 1990). Quantitative risk assessments for such
chronic risk require the comparison between a tolerable dose of the contaminant called Provisional
Tolerable Weekly Intake (PTWI) and the population's usual intake. The usual intake distribution is
generally estimated from independent individual food consumption surveys (generally not exceeding 7
days) and food contamination data. Several models have been developed to estimate the distribution
of usual dietary intake from short-term measurements (see for example, Nusser et al., 1993;
Hoffmann et al., 2002). The proportion of consumers whose usual weekly intake exceeds the PTWI
can then be viewed as a risk indicator (see for example, Tressou et al., 2004). This kind of risk
assessment does not account for the underlying dynamic process, i.e. for the fact that the
contaminant is ingested over time and naturally eliminated at a certain rate by the human body.
Moreover, longer term measurements of consumption are available through household budget surveys
(HBS).

In this paper, we propose to use HBS data to quantify individual long term exposure to a
contaminant. This data provides long time series of household food acquisitions which are first used
in a decomposition model, similar to the one proposed by Chesher (1997, 1998) in the nutrition
field, in order to obtain time series of individual intakes. Then, the pharmacokinetic properties of
the contaminant are integrated into an autoregressive model in which the current body burden is
defined as a fraction of the previous one plus the current intake.

From a toxicological point of view, this approach is, to our knowledge, novel and hence requires 
the definition of an ad-hoc long term safe dose as proposed in the next section. We refer to this 
autoregressive model as Kinetic Dietary Exposure Model (KDEM). 

From a statistical point of view, such autoregressive models are well known in general time series
analysis (see for example, Hamilton, 1994) and most of the paper is devoted to the description of
the decomposition model. This statistical model aims at estimating individual quantities from total
household quantities and structures. This problem is similar to that studied by Engle et al. (1986),
Chesher (1997, 1998), and Vasdekis and Trichopoulou (2000), and is addressed in a slightly different
way. In the present article, the individual contaminant intake is firstly viewed as a nonlinear
function of age within each gender, with time and socioeconomic characteristics being secondly
introduced in a linear way. The nonlinear function is represented by a truncated polynomial
spline of order 1 that admits a mixed model spline representation (section 4.9 in Ruppert et al.,
2003). These choices yield a simple linear mixed model which is estimated by REstricted Maximum
Likelihood (REML, Patterson and Thompson, 1971). One major extension of the proposed model
compared to Chesher (1997) is the introduction of dependence between the individual intakes of a
given household.

In the next section, focusing on the methylmercury example even though the method is much 
more general and could be applied to any chronic food risk, SECODIP data are described along 
with the construction of a household intake series and the individual cumulative and long term 
exposure concepts yielding the KDEM. Section 2 is devoted to the statistical methodology used to
decompose the household intake series into individual intake series, namely the presentation of the
model and its estimation and tests. Section 3 displays the results for the quantification of long term
exposure to methylmercury of the French population using the 2001 SECODIP panel. Finally, a
discussion on the use of household acquisition data, with the focus on the French SECODIP panel,
is conducted in section 4 with respect to the proposed long term risk analysis.



1 Motivating example: risk related to methylmercury in seafoods 
in the French population 

In this section, the Kinetic Dietary Exposure Model (KDEM) and the concept of long term risk are 
defined. Then a brief panorama of consumption data in France is given and the way the SECODIP 
HBS data will be used as an input of the KDEM is described.

1.1 Cumulative exposure and long term risk: the Kinetic Dietary Exposure Model (KDEM)

The main objective of the analysis is to assess individuals' long term exposure to a contaminant to
deduce whether these individuals are at risk or not. As mentioned in the introduction, the only "safe
dose" reference is the PTWI expressed in terms of body weight (relative intake). Unfortunately,
TNS SECODIP did not record the body weight of the individuals until 2001. The body weights
are thus estimated from independent data sets, namely the French national survey on individual
consumption (INCA, CREDOC-AFSSA-DGAL, 1999) for people older than 18, and the weekly
body weight distribution available from French health records (Sempé et al., 1979) for individuals
under 18. In both cases, gender differentiation is introduced.

Assume that estimations of the individual weekly intakes are available, that is $y_{i,h,t}$ denotes the
intake of individual $i$ belonging to household $h$ for the $t$-th week (with $i = 1, \ldots, n_{h,t}$,
$h = 1, \ldots, H$ and $t = 1, \ldots, T$), and $D_{i,h,t}$ denotes the same quantity expressed on a body
weight basis. The cumulative exposure up to the $t$-th week of this individual is then given by

$$S_{i,h,t} = \exp(-\eta) \cdot S_{i,h,t-1} + D_{i,h,t}, \tag{1.1}$$

where $\eta > 0$ is the natural dissipation rate of the contaminant in the organism. This dissipation
parameter is defined from the so called half life of the contaminant, which is the time required for
the body burden to decrease by half in the absence of any new intake. For methylmercury, the half
life, denoted by $l_{1/2}$, is estimated to be 6 weeks, so that $\eta = \ln(2)/l_{1/2} := \ln(2)/6$ (Smith and
Farris, 1996).

The autoregressive model defined by (1.1) and a given initial state $S_{i,h,0} = D_{i,h,0}$ has a stationary
solution since $\exp(-\eta) < 1$. As a convention, $S_{i,h,0}$ is set to the mean of all positive exposures
$(D_{i,h,t})_{t=1,\ldots,T}$. However, this convention has little impact on the level of an individual's long term
exposure since the contribution of the initial state $S_{i,h,0}$ tends to zero as $t$ increases. We call this
autoregressive model "KDEM" for Kinetic Dietary Exposure Model.
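
The recursion (1.1) can be illustrated with a minimal Python sketch. The function name, its arguments and the choice of taking the initial state as the mean of the individual's own positive intakes are assumptions made for illustration; they are not taken from the authors' program.

```python
import math

def cumulative_exposure(weekly_intakes, half_life_weeks=6.0, initial=None):
    """Return [S_1, ..., S_T] from equation (1.1) for one individual.

    weekly_intakes: intakes D_1..D_T expressed per kg of body weight.
    initial: S_0; if None, the mean of the positive intakes is used
             (an assumed reading of the convention described in the text).
    """
    eta = math.log(2) / half_life_weeks      # dissipation rate
    decay = math.exp(-eta)                   # fraction kept from one week to the next
    if initial is None:
        positive = [d for d in weekly_intakes if d > 0]
        initial = sum(positive) / len(positive) if positive else 0.0
    exposures = []
    previous = initial
    for d in weekly_intakes:
        previous = decay * previous + d      # S_t = exp(-eta) * S_{t-1} + D_t
        exposures.append(previous)
    return exposures
```

With no intake at all, the body burden is halved after one half life: starting from `initial=8.0`, six weeks of zero intake leave a burden of 4.0.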

The individual cumulative exposure $S_{i,h,t}$ can be considered to be the long term exposure of an
individual for sufficiently large values of $t$. For methylmercury, the long term steady state of the
individual exposure is reached after 5 or 6 half lives according to Dr P. Grandjean,
a methylmercury expert. Thus, the long term individual's exposure to methylmercury is defined
as the cumulative exposure reached after, say, $6\,l_{1/2} = 36$ weeks.

The risk assessment usually consists of comparing the exposure with the so called Provisional 
Tolerable Weekly Intake (PTWI). This tolerable dose, determined from animal experiments and 
extrapolated to humans, refers to the dose an individual can ingest throughout his entire life 
without appreciable risk. For methylmercury, the PTWI is set to 1.6 microgram per kilogram of
body weight per week (1.6 μg/kg bw, see FAO/WHO, 2003).

In our dynamic approach, the long term exposure is compared to a reference long term exposure
denoted by $S^{ref}$, and defined as the cumulative exposure of an individual whose weekly intake is
equal to the PTWI, $d$, such as

$$S^{ref} = \lim_{t \to \infty} S_t^{ref} = \frac{d}{1 - \exp(-\eta)}, \tag{1.2}$$

where

$$S_t^{ref} = \sum_{s=0}^{t} d \exp(-\eta (t-s)) = d\,\frac{\exp(-\eta (t+1)) - 1}{\exp(-\eta) - 1}. \tag{1.3}$$

For methylmercury, the reference for long term exposure $S^{ref}$ is 14.6 μg/kg bw. An individual
is then assumed to be at risk if his cumulative exposure $S_{i,h,t}$ exceeds the reference $S^{ref}$ for any
$t > 6\,l_{1/2}$.

This KDEM model requires some long surveys of individual intakes which are not monitored 
and can only be approximated from available consumption data and contamination data. 
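
As a numerical check of the reference exposure, the short sketch below evaluates the limit $d/(1 - \exp(-\eta))$ and the finite geometric sum for a constant weekly intake equal to the PTWI. The function names are illustrative assumptions, and the finite sum is taken to start at $s = 0$.

```python
import math

def reference_exposure(ptwi, half_life_weeks):
    """Limit of the cumulative exposure for a constant weekly intake `ptwi`."""
    eta = math.log(2) / half_life_weeks
    return ptwi / (1.0 - math.exp(-eta))

def reference_exposure_at(ptwi, half_life_weeks, t):
    """Finite-horizon version: sum over s = 0..t of ptwi * exp(-eta * (t - s))."""
    eta = math.log(2) / half_life_weeks
    return sum(ptwi * math.exp(-eta * (t - s)) for s in range(t + 1))
```

For methylmercury (PTWI of 1.6 μg/kg bw per week, half life of 6 weeks), `reference_exposure(1.6, 6)` lies between 14.6 and 14.7 μg/kg bw, in line with the reference value quoted above, and the finite sum after 36 weeks is already within about 2% of that limit.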

1.2 From household acquisition data to household intake series 

Two current major consumption data sources in France are the national survey on individual
consumption (INCA, CREDOC-AFSSA-DGAL, 1999) and the SECODIP panel managed by the
company TNS SECODIP. Most quantitative risk assessments conducted by the French agency for
food safety (AFSSA) use the 7 day individual consumption data of the INCA survey jointly with
contamination data collected by several French institutions. Regarding methylmercury, seafood
contamination data have been collected through different analytical surveys (MAAPAR, 1998-2002;
IFREMER, 1994-1998) and were used in Tressou et al. (2004) and Crépet et al. (2005) combined
with the INCA survey. In this paper, a methodology using the SECODIP data is developed (see
Boizot, 2005, for a full description of this database).
The company TNS SECODIP has been collecting the weekly food acquisition data of about five 
thousand households since 1989. All participating households register grocery purchases through 
the use of EAN bar codes but other grocery purchases are registered differently: the fresh fruit 
and vegetable purchases are recorded by the FL sub-panel while fresh meat, fresh fish and wine 
purchases are recorded by the VP sub-panel. The households are selected by stratification according 
to several socioeconomic variables and stay in the survey for about 4 years. TNS SECODIP provides 
weights for each sub-panel and each period of 4 weeks to make sure of the representativeness of 
the results in terms of several socioeconomic variables. TNS SECODIP also defines the notion of 
household activity which refers to the correct and regular reporting of household purchases over a 
year. For each household, the age and gender of each member of the household are retained in our 
decomposition model with some socioeconomic variables: the region, the social class (from modest 
to well-to-do), the occupation category and level of education of the principal household earner. 

For methylmercury risk assessment, the households of the VP panel are considered; in the 2001
data set, there are H = 3229 active households (corresponding to 9288 individuals) and T = 53
weeks during which the households may or may not acquire seafood. The weekly purchases of
seafood are clustered into two categories ("Fish" and "Mollusks and Shellfish") for which the mean
contamination levels are calculated from the MAAPAR-IFREMER data and are given in Table 1.

Table 1 around here

Household intake series $(y_{h,t})_{h=1,\ldots,H;\ t=1,\ldots,T}$ are computed as the cross product between weekly
purchases of seafoods, which are assimilated to weekly consumptions, and mean contamination
levels. They are expressed in micrograms per week (μg/w). The food "purchase-consumption"
assimilation is of course arguable and will be the main subject of the final discussion (see section
4). An additional assumption concerns the household size, denoted by $n_{h,t}$ for the household $h$
and the week $t$. This can indeed vary over time in the case of a birth or death of a household
member. Since a new born baby will not consume fish in his first few months, we assume that
food diversification (and hence consumption of seafoods) starts at one year of age, yielding a total
sample of 8913 individuals for the 2001 panel. These household intake series are then decomposed
into individual intake series using the model described in the next section. These individual intake
series are then used as inputs of the KDEM.
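
The construction of a household intake series can be sketched as follows. The contamination levels used here are purely hypothetical placeholders (the actual values are those of Table 1), and the category keys and function name are assumptions made for the example.

```python
# Hypothetical mean contamination levels, in micrograms of methylmercury per kg of food.
# The real values come from the MAAPAR-IFREMER data (Table 1) and are not reproduced here.
CONTAMINATION_UG_PER_KG = {"fish": 100.0, "mollusks_shellfish": 20.0}

def household_intake_series(weekly_purchases_kg, contamination=CONTAMINATION_UG_PER_KG):
    """Household intake in micrograms per week.

    weekly_purchases_kg: one dict per week mapping a seafood category to the
    quantity purchased (kg); purchases are assimilated to consumption.
    """
    return [
        sum(kg * contamination[category] for category, kg in week.items())
        for week in weekly_purchases_kg
    ]
```

A week without any seafood purchase simply contributes an intake of zero to the series.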



2 Statistical methodology 



In this section, the decomposition model is described and compared to similar models described in
the literature, namely Chesher (1997, 1998) and Vasdekis and Trichopoulou (2000). Its estimation and
some structure tests are then presented.

2.1 The decomposition model 
2.1.1 General principle 

Consider a household composed of $n_{h,t}$ members, each member having unobserved weekly intakes
$y_{i,h,t}$, with $i = 1, \ldots, n_{h,t}$, $h = 1, \ldots, H$, and $t = 1, \ldots, T$. The week $t$ intake of a household $h$ is
simply the sum across household members of the individual weekly intakes, such as

$$y_{h,t} = \sum_{i=1}^{n_{h,t}} y_{i,h,t}. \tag{2.1}$$

As detailed below, the individual weekly intake $y_{i,h,t}$ is assumed to depend on

• the age and gender of the individual via a function $f$,

• some socioeconomic characteristics of the household,

• time (seasonal variations).



There are obviously several ways to model the individual intake under these assumptions
and this choice leads to more or less simple estimation procedures. In Chesher (1997, 1998) and
Vasdekis and Trichopoulou (2000), a discretization argument on age is used leading to a penalized
least square estimation of a great number of parameters, that is one parameter for each year of age
and gender. We propose to use a truncated polynomial spline of order 1 for each gender, which
admits a mixed model spline representation for $f$. As far as socioeconomic characteristics are
concerned, Chesher (1997) retained a multiplicative specification whereas Vasdekis and Trichopoulou
(2000) chose the additive one. In the multiplicative model, a change in income for example would
proportionally affect all the individual intakes whereas in the additive setting, they would be
affected by the same value. Following Vasdekis and Trichopoulou (2000), we retained the additive
specification since the difference between the two specifications may not be notable, and the
additive setting yields a much simpler estimation procedure (linear model). Finally, time dependency
is only introduced in Chesher (1998) to track changes with age within cohorts: this time
dependency is directly introduced into the function $f$ that is bivariately smoothed according to age and
time (cf. Green and Silverman, 1994). Again, we adopt a simpler specification in which time is
introduced as a dummy variable. All these assumptions yield an individual model of the form

$$y_{i,h,t} = X_{i,h,t}\beta + Z_{i,h,t}u + W_{h,t}\gamma + \delta_t\alpha + \varepsilon_{i,h,t}, \tag{2.2}$$

where the terms $X_{i,h,t}\beta + Z_{i,h,t}u$ stand for the mixed model spline representation of the function
$f$, the term $W_{h,t}\gamma$ denotes the socioeconomic effects, the term $\delta_t\alpha$ the time effect, and $\varepsilon_{i,h,t}$ is the
individual error term.

Combining (2.1) and (2.2), we obtain the final rescaled household model given by

$$y_{h,t} = X_{h,t}\beta + Z_{h,t}u + \sqrt{n_{h,t}}\,W_{h,t}\gamma + \sqrt{n_{h,t}}\,\delta_t\alpha + \varepsilon_{h,t}, \tag{2.3}$$

where the rescaled quantities are $y_{h,t} = \sum_{i=1}^{n_{h,t}} y_{i,h,t}/\sqrt{n_{h,t}}$, $X_{h,t} = \sum_{i=1}^{n_{h,t}} X_{i,h,t}/\sqrt{n_{h,t}}$,
$Z_{h,t} = \sum_{i=1}^{n_{h,t}} Z_{i,h,t}/\sqrt{n_{h,t}}$, and $\varepsilon_{h,t} = \sum_{i=1}^{n_{h,t}} \varepsilon_{i,h,t}/\sqrt{n_{h,t}}$.
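
The rescaling that leads from the individual model to the household model amounts to summing the members' design rows and dividing by the square root of the household size. A minimal sketch, with an assumed function name and plain Python lists standing in for design rows:

```python
import math

def rescale_household(individual_rows, household_intake):
    """Aggregate individual design rows into one rescaled household row.

    individual_rows: one list of covariates per household member (equal lengths).
    household_intake: observed household intake, i.e. the sum of the unobserved
    individual intakes.
    Returns (rescaled intake, rescaled design row).
    """
    n = len(individual_rows)
    scale = math.sqrt(n)
    summed = [sum(column) for column in zip(*individual_rows)]
    return household_intake / scale, [value / scale for value in summed]
```

Columns that are identical for all members (socioeconomic dummies, week dummies) are summed $n$ times and divided by $\sqrt{n}$, which reproduces the $\sqrt{n_{h,t}}$ factor in front of the household-level terms.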



2.1.2 Specification details 

Age-gender function specification Let $a_{i,h,t}$ and $s_{i,h}$ denote the age and sex of individual $i$
of household $h$ for the $t$-th week. Individual dietary intake is generally different according to the
gender of individuals, so the function $f$ takes the following form

$$f(a_{i,h,t}, s_{i,h}) = f_M(a_{i,h,t})\,1_{\{s_{i,h}=M\}} + f_F(a_{i,h,t})\,1_{\{s_{i,h}=F\}},$$

where $f_M(\cdot)$ and $f_F(\cdot)$ are age-intake relationships for males (M) and females (F) respectively, and
$1_{\{A\}}$ is the indicator function of event $A$. The function $f_S(\cdot)$ is approximated by a spline of order
one with a truncated polynomial basis for either sex, such as

$$f_S(a_{i,h,t}) = \beta_0^S + \beta_1^S a_{i,h,t} + \sum_{k=1}^{K_S} u_k^S (a_{i,h,t} - \kappa_{S,k})_+, \tag{2.4}$$

where the $(\kappa_{S,k})_{k=1,\ldots,K_S}$ are nodes chosen from an age list and

$$(a_{i,h,t} - \kappa_{S,k})_+ = (a_{i,h,t} - \kappa_{S,k})\,1_{\{a_{i,h,t} - \kappa_{S,k} > 0\}}$$

denotes the positive part of the difference between the age of the individual $a_{i,h,t}$ and the node $\kappa_{S,k}$,
and the $u_k^S$ are random effects assumed to be i.i.d. Gaussian with distribution $\mathcal{N}(0, \sigma_{u_S}^2)$. This
last assumption allows us to introduce some penalties into the model and to smooth the function
$f_S$, yielding a mixed model representation for the spline as shown in Speed (1991); Verbyla (1999);
Brumback et al. (1999); Ruppert et al. (2003). As in Ruppert et al. (2003), page 125, the total
number of nodes $K_S$ is set to $\min\{|a_{S,d}|/4,\ 35\}$, where $a_{S,d}$ is the list of distinct ages for individuals
of sex $S$, and the nodes $\kappa_{S,k}$ are defined as the $\frac{k+1}{K_S+2}$-th percentile of vector $a_{S,d}$ for $k = 1, \ldots, K_S$.
Defining $X_{i,h,t}$ as the row vector $(1_{\{s_{i,h}=M\}},\ a_{i,h,t} 1_{\{s_{i,h}=M\}},\ 1_{\{s_{i,h}=F\}},\ a_{i,h,t} 1_{\{s_{i,h}=F\}})$, and
$Z_{i,h,t}$ as the row vector $\big((a_{i,h,t} - \kappa_{S,k})_+ 1_{\{s_{i,h}=S\}}\big)_{k=1,\ldots,K_S;\ S=M,F}$, we finally obtain the first
terms of (2.2), that is $f(a_{i,h,t}, s_{i,h}) = X_{i,h,t}\beta + Z_{i,h,t}u$.

Socioeconomic characteristics and time dependency In the application, all the socioeconomic
characteristics are categorical variables. Consider the $Q$ categorical variables $W^{(q)}$, $q =
1, \ldots, Q$, with $m_q$ modalities, and fix the $m_q$-th modality as the reference modality, then the
socioeconomic effect term in (2.2) and (2.3) is

$$W_{h,t}\gamma = \sum_{q=1}^{Q} \sum_{m=1}^{m_q - 1} \gamma_{q,m}\,1_{\{W^{(q)}_{h,t} = m\}},$$

where $\gamma_{q,m}$ is the effect of the $m$-th modality of the socioeconomic variable $q$.

Similarly, time is only measured by weekly counts throughout the year so that the time effect
in (2.2) and (2.3) is simply

$$\delta_t\alpha = \sum_{\tau=1}^{T} \alpha_\tau\,1_{\{\tau = t\}},$$

where $\alpha_\tau$ is the effect of week $\tau$ and $t_r$ is the reference week, whose effect is set to zero.
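
The age part of the design, i.e. the truncated-line basis of equation (2.4), can be sketched in a few lines of Python. The knot rule coded here (number of knots equal to the minimum of a quarter of the number of distinct ages and 35, knots at the $(k+1)/(K+2)$ sample quantiles of the distinct ages) is the default rule of Ruppert et al. (2003) as understood here; the nearest-rank quantile and all function names are assumptions made for illustration.

```python
def default_knots(ages, max_knots=35):
    """Knots placed at the (k+1)/(K+2) quantiles of the distinct ages."""
    distinct = sorted(set(ages))
    k_total = max(1, min(len(distinct) // 4, max_knots))
    knots = []
    for k in range(1, k_total + 1):
        p = (k + 1) / (k_total + 2)
        # Nearest-rank empirical quantile (an assumed convention; others exist).
        index = min(len(distinct) - 1, int(p * len(distinct)))
        knots.append(distinct[index])
    return knots

def truncated_line_basis(age, knots):
    """Values of (age - kappa)_+ for each knot kappa."""
    return [max(age - kappa, 0.0) for kappa in knots]

def design_rows(age, sex, knots_by_sex):
    """Fixed-effect row X and random-effect row Z for one individual.

    sex is "M" or "F"; knots_by_sex maps each sex to its list of knots.
    """
    x = [1.0 if sex == "M" else 0.0, age if sex == "M" else 0.0,
         1.0 if sex == "F" else 0.0, age if sex == "F" else 0.0]
    z = []
    for s in ("M", "F"):
        basis = truncated_line_basis(age, knots_by_sex[s])
        z.extend(basis if s == sex else [0.0] * len(basis))
    return x, z
```

Each individual only activates the columns of his or her own sex, which is how a single linear mixed model carries two separate age curves.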

Error specification The error at the individual level $\varepsilon_{i,h,t}$ is assumed to be Gaussian with zero
mean and variance $\sigma_\varepsilon^2$, and the variance-covariance structure is such that

• households are independent, i.e. $\forall i, i', t, t'$ and $\forall h \neq h'$,

$$\mathrm{cov}(\varepsilon_{i,h,t}, \varepsilon_{i',h',t'}) = 0,$$

• members of the same household are dependent, that is $\forall h, t$ and $i \neq i'$,

$$\mathrm{cov}(\varepsilon_{i,h,t}, \varepsilon_{i',h,t}) = \rho\sigma_\varepsilon^2,$$

where $\rho$ measures the dependence between individuals within the same household,

• there is no time dependence, that is $\forall i, i'$ and $\forall t \neq t'$,

$$\mathrm{cov}(\varepsilon_{i,h,t}, \varepsilon_{i',h,t'}) = 0.$$

In the rescaled household model (2.3), the error $\varepsilon_{h,t} = \sum_{i=1}^{n_{h,t}} \varepsilon_{i,h,t}/\sqrt{n_{h,t}}$ is Gaussian with
a zero mean and a variance $R$ such that $\forall t, t'$ and $\forall h \neq h'$,

$$V(\varepsilon_{h,t}) = \rho\sigma_\varepsilon^2 n_{h,t} + (1 - \rho)\sigma_\varepsilon^2 \quad \text{and} \quad \mathrm{cov}(\varepsilon_{h,t}, \varepsilon_{h',t'}) = 0. \tag{2.5}$$
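
For completeness, the household error variance can be derived in three lines from the individual-level assumptions, writing $\sigma_\varepsilon^2$ for the individual error variance and $\rho\sigma_\varepsilon^2$ for the within-household covariance (this notation is an assumption where the printed symbols are unclear):

```latex
\begin{align*}
V(\varepsilon_{h,t})
  &= \frac{1}{n_{h,t}} \Big[ \sum_{i=1}^{n_{h,t}} V(\varepsilon_{i,h,t})
     + \sum_{i \neq i'} \mathrm{cov}(\varepsilon_{i,h,t}, \varepsilon_{i',h,t}) \Big] \\
  &= \frac{1}{n_{h,t}} \Big[ n_{h,t}\,\sigma_\varepsilon^2
     + n_{h,t}(n_{h,t} - 1)\,\rho\sigma_\varepsilon^2 \Big] \\
  &= \rho\sigma_\varepsilon^2\, n_{h,t} + (1 - \rho)\,\sigma_\varepsilon^2 .
\end{align*}
```

The variance is therefore linear in the household size, with slope $\rho\sigma_\varepsilon^2$ and intercept $(1-\rho)\sigma_\varepsilon^2$.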
2.2 Estimation and tests 

The model (2.3) is a linear mixed model that can be estimated using restricted maximum likelihood
(REML) techniques, see Ruppert et al. (2003) for details. An attractive consequence of the use of
the mixed model representation of a penalized spline in (2.4) is that mixed model methodology
and software can be used to estimate the parameters and predict the random effect in the resulting
household model. The amount of smoothing of the underlying functions $f_S$ is estimated with the
REML technique via the estimation of $\sigma_{u_S}^2$. The estimation was conducted using the SAS MIXED
procedure. To get estimators for $\sigma_\varepsilon^2$ and $\rho$, asymptotic least square techniques combined with the
linear relationship between the variance given in (2.5) and the household size were used. More
precisely, a residual variance $\sigma_n^2$ is first estimated for each household size $n = 1, \ldots, N = \max n_{h,t}$
using an option of the MIXED procedure (see the program for the detailed syntax). Then, ordinary
least square regression and the delta method give estimators for $\sigma_\varepsilon^2$ and $\rho$ and their standard
deviations.

The individual intake is then predicted by

$$\hat{y}_{i,h,t} = X_{i,h,t}\hat{\beta} + Z_{i,h,t}\hat{u} + W_{h,t}\hat{\gamma} + \delta_t\hat{\alpha}, \tag{2.6}$$

where $\hat{\beta}$, $\hat{\gamma}$, and $\hat{\alpha}$ are the estimators of $\beta$, $\gamma$, and $\alpha$ respectively and $\hat{u}$ is the best prediction of the
random effect $u$ in the model (2.3).
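
The second estimation step, recovering the individual error variance and the within-household correlation from the residual variances estimated per household size, can be sketched as a plain unweighted least squares fit. This is a simplification: it does not reproduce the asymptotic least squares weighting or the delta-method standard errors, and the function name is an assumption.

```python
def variance_components(sizes, variances):
    """Fit variances ~ intercept + slope * size by ordinary least squares.

    Under the linear variance relationship, slope = rho * sigma2 and
    intercept = (1 - rho) * sigma2, so sigma2 = intercept + slope and
    rho = slope / sigma2.
    """
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(variances) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, variances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sigma2 = intercept + slope
    rho = slope / sigma2
    return sigma2, rho
```

For instance, residual variances of 4, 5, 6, 7, 8 and 9 for household sizes 1 to 6 correspond to an individual variance of 4 and a correlation of 0.25.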

Confidence and prediction intervals can be built for the prediction $\hat{y}_{i,h,t}$ as proposed in Ruppert
et al. (2003) and several tests can be conducted in this model:

1. Are the random effects different according to sex? In other words, is the assertion
$\sigma_{u_M}^2 = \sigma_{u_F}^2$ true?

2. Another question is the necessity for such random effects. Is the assertion $\sigma_u^2 = 0$ (resp.
$\sigma_{u_M}^2 = 0$ or $\sigma_{u_F}^2 = 0$) true?

3. More globally, is the function $f$ the same for both sexes? Is the assertion $f_M = f_F$ true?

These tests can be conducted using classical likelihood (or restricted likelihood) ratio techniques.
The likelihood ratio statistic is asymptotically distributed as a chi square with a degree of freedom
being the number of tested equalities, except for point 2, where the limiting distribution is known to
be a mixture of chi-squares (Self and Liang, 1987; Crainiceanu et al., 2003) because the test concerns
the frontier of the parameter definition ($\sigma_u^2 \in [0, +\infty[$).
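
To illustrate the boundary issue, the sketch below computes a p-value for a single variance component tested at zero, assuming the textbook 50:50 mixture of a point mass at zero and a chi-square with one degree of freedom. This mixture is an assumption made for the example; the exact weights in a penalized spline model may differ.

```python
import math

def chi2_sf_1df(x):
    """Survival function of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def boundary_lrt_pvalue(lrt_statistic):
    """P-value under an assumed 0.5 * chi2_0 + 0.5 * chi2_1 limiting mixture."""
    if lrt_statistic <= 0:
        return 1.0
    return 0.5 * chi2_sf_1df(lrt_statistic)
```

Under this assumption the p-value is half the usual one-degree-of-freedom p-value, so ignoring the boundary makes the test conservative.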



3 Applying our methodology to the methylmercury risk assessment

In this section, we illustrate our approach on our motivating example. Firstly, several tests are 
conducted on the decomposition model, and secondly, individual long term exposure is compared 
to the reference long term exposure described in section 1.

3.1 Estimation and tests on the structure of the model 

Table 2 shows the REML estimates for all socioeconomic variables (parameter $\gamma$) and the p-values
of Student tests in the model (2.3). The socioeconomic variables used are household income, region
of residence, occupation category and level of education of the principal household earner. For each
socioeconomic variable, the reference modality is given in Table 2. We assume here that

• the function $f$ differs according to the gender but the random effect does not ($f_M \neq f_F$ and
$\sigma_{u_M}^2 = \sigma_{u_F}^2$),

• the maximum household size is set to 6 for variance-covariance estimation. Indeed, the
dependence between individuals within the same household depends on the household size
$n_h$ in (2.5). For each household size, a variance is estimated, and estimates of $\rho$ and $\sigma_\varepsilon^2$
are obtained using asymptotic least square techniques as mentioned in section 2.2. Since
large households are not numerous in the database, the estimations are implemented with a
maximum household size, $N$, set to 6; it is assumed that there is a common variance for all
households with size greater than $N$.

In this sub-section, we show the results of several tests we carried out to simplify the
interpretation of our study. These tests have been implemented in a hierarchical way, starting with
the highest-order interaction terms, and merging with the reference modality any modality which does
not differ significantly from it. All tests are performed at the 5% level of significance
and each new hypothesis is tested conditionally on the results of the previous tests. Each null
hypothesis and the p-value resulting from the appropriate F-test are shown in Table 3.

First of all, concerning the occupation category variable, the self-employed modality does not
significantly differ from the reference modality blue collar workers (H1, Pval = 0.771). Refitting the
model with the reference modality "Blue collar workers and self employed", all the socioeconomic
variables are significantly different from the reference. Then, F-tests allow us to conclude that the
resulting three groups are significantly different from each other (H2, H3, H4).

Let us now consider the region of residence variable. First, there are some very substantial
differences among the 4 regions of residence (H5, Pval < 0.001). However, the modality "North,
Brittany, and Vendee coast" and the modality "Paris and its suburbs" should be grouped (H6,
Pval = 0.881). Then, the other tests implemented for the level of education and income variables
suggest that no further simplification is possible (see p-values of null hypotheses H7, H8, H9 in
Table 3). Finally, the overall F-test comparing our resulting final model to the original model (2.3)
shows that no important variable has been left out of the model (Pval = 0.59).

Table 4 shows the parameter estimates and p-values of the Student's t-tests for all socioeconomic
variables of the reduced final model. The income effects on individual exposure are those
expected: the richer the households are, the higher their exposures are because seafoods are
expensive. Furthermore, living in a coastal region or in Paris and its suburbs brings about larger
individual exposure relative to living in a non coastal region because of the more ready supply of
seafoods in these regions. Moreover, the more educated the principal household earner is, the larger
the individual exposure is. The occupation category of the principal household earner has an
unexpected effect on the individual exposure. Indeed a higher exposure is expected for white collar
workers and retirees when compared to blue collar workers but an opposite effect is observed. This
may be explained by the fact that the reference modality for this variable is a very heterogeneous
modality also comprising managers and self-employed persons (farmers and craftsmen). Another
explanation could be that white collar workers have a higher propensity to eat out in restaurants
whereas outside the home consumption is not included in the model.

Table 2 around here
Table 3 around here
Table 4 around here

Likelihood ratio tests are implemented to test the structure of the final model. First, the
dependence of individual exposures to methylmercury within a household is tested. The null
hypothesis $\rho = 0$ (cf. equation (2.5)) is rejected (Pval numerically zero), which confirms that
individuals within the same household have correlated exposures. Then, we test if the function $f$ is
the same for both genders. The null hypothesis $f_M = f_F$ is rejected (Pval numerically zero) but the
null hypothesis $\sigma_{u_M}^2 = \sigma_{u_F}^2$ is not rejected. This means that individual exposure differs with
gender but both functions need the same amount of smoothing.

3.2 The cumulative and the long term individual exposure 

The cumulative individual exposure $S_{i,h,t}$ is calculated from the estimated individual weekly intakes
according to equation (1.1) and the resulting values for $t > 35$ are compared to the reference
cumulative exposure defined by (1.3). Figure 1 shows the cumulative individual exposure over the
53 weeks of the year 2001 for different individuals. Only certain percentiles of the distribution of
the individual cumulative exposures of the last week are displayed. For example, the curve Pmax
represents the cumulative exposure of an individual whose last week's cumulative exposure is the
highest. This is the cumulative exposure of a girl who turned one year old during the 30th week of
2001 and lives in a well-to-do household in Paris or its suburbs.

Very few individuals have a cumulative individual exposure above the reference long term
exposure. We estimate that only 0.186% of individuals are deemed at risk. This risk index should be
compared to the more common one defined as the percentage of weekly intakes $D_{i,h,t}$ exceeding the
PTWI, denoted $R_{1.6}$, such as

$$R_{1.6} = \frac{1}{\sum_{t=1}^{T}\sum_{h=1}^{H} n_{h,t}} \sum_{t=1}^{T} \sum_{h=1}^{H} \sum_{i=1}^{n_{h,t}} 1(D_{i,h,t} > 1.6).$$

$R_{1.6}$ is equal to 0.45%, and is higher since each occasional deviation above the PTWI increases
the risk index whereas only long term deviations above this PTWI should be taken into account to
assess the risk.

A deeper analysis of at risk individuals shows that all these vulnerable individuals are children
less than three years old. They represent 5.29% of the children aged between 1 and 3 in 2001.
Further, no child from a modest household is found to be at risk.
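
The two risk indices compared above can be computed as in the following sketch. The function names are assumptions, and the long term index is evaluated from week 36 onwards (i.e. for $t > 35$), which is the reading adopted here.

```python
def long_term_risk(exposure_series, reference, start_week=36):
    """Share of individuals whose cumulative exposure exceeds `reference`
    in at least one week t >= start_week (weeks are numbered from 1)."""
    at_risk = sum(
        1 for series in exposure_series
        if any(s > reference for s in series[start_week - 1:])
    )
    return at_risk / len(exposure_series)

def weekly_exceedance_rate(intake_series, ptwi=1.6):
    """Share of individual weekly intakes (per kg body weight) above the PTWI."""
    total = sum(len(series) for series in intake_series)
    exceeding = sum(1 for series in intake_series for d in series if d > ptwi)
    return exceeding / total
```

An individual with a single heavy week early in the year raises the weekly exceedance rate but is not counted by the long term index unless the cumulative exposure is still above the reference after the burn-in period.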

Figure 1 around here



4 Discussion 

As mentioned in section 1, the use of household acquisition data in a food safety context, and in
our case the use of the SECODIP database for assessing methylmercury dietary intakes, gives rise
to some approximations:

1. Consumption outside of the home is out of the scope of household acquisition data. TNS
SECODIP does not provide any information on the quantities of seafoods consumed out
of the home or bought for outside consumption. Nevertheless, Serra-Majem et al. (2003)
assert that these data are good estimates for the consumption of the whole household.
Vasdekis and Trichopoulou (2000) avoid this question by using the term "availability"
instead of intake or consumption. However, as in Chesher (1997), auxiliary information about
outdoor consumption could be introduced in the model as a correction factor accounting for
the propensity to eat outside of the home according to age, sex or socioeconomic variables.
The French INCA survey on individual consumptions gives details about inside / outside the
home consumption for 3003 individuals aged 3 and older. The mean outside the home
consumption proportion is 20% for seafoods. Applying such a factor to all household intakes
yields a long term risk of 0.226%, and $R_{1.6}$ = 0.791%. Furthermore, in this case, a small
proportion of consumers older than 3 years old are vulnerable. Nevertheless, children aged
between 1 and 3 in 2001 still represent the most vulnerable consumer group, at 10% of the
corresponding population.



2. The amount of food bought by a household can be different from the amount actually
consumed. Indeed, particularly for seafoods, a non negligible part is not edible: Favier et al.
(1995) show that on average only 61% of fresh or frozen fish is edible. Besides, Maresca and Poquet
(1994) also demonstrate that some part of the purchased food is thrown away, which also reduces
the actual amount of food consumed by a household. However, SECODIP does not specify
whether the quantity of fresh or frozen fish bought is ready to be consumed or is a whole fish
that needs some preparation. Applying such a factor to all household intakes yields a long
term risk of 0.00%, and $R_{1.6}$ = 0.043%. If both the 20% outside of the home consumption
correction factor and the 61% edible proportion factor are applied to our series, the long term
risk is equal to 0.021%, $R_{1.6}$ = 0.13%, and 1.06% of the population of children aged between
1 and 3 are vulnerable. These results stress that applying such a correction factor to assess
the actual quantity consumed is probably too strong and is certainly a crude approximation
of the quantity of seafoods ingested. Thus, a more detailed database on fish and seafood is
needed to carry out an accurate assessment of exposure to methylmercury, taking into account
only the edible part of fish and other seafood.

Body weight information is crucial in a food safety context and will be included in future
SECODIP data, since it has now been added to the list of required individual characteristics.
The measurement error attached to this quantity will remain, however, particularly for
children, whose body weight changes considerably over a year. Nevertheless, approximating the
weekly body weight of young children by the median of the weekly body weight distribution
available in French health records is the best approximation possible.

3. The food nomenclature of the SECODIP database is not as detailed as the contamination 
database. Unfortunately, fish and seafood species are not well documented so it is not possible 
to consider more than two food categories when computing household intakes. This problem 
of nomenclature matching is ubiquitous in food risk assessments since contamination analyses
are generally conducted independently from the food nomenclature of consumption data.
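
As a rough illustration of the two correction factors discussed in points 1 and 2, the
following Python sketch rescales a purchased quantity. The function name, the argument names
and, above all, the way the outside-the-home share is applied (dividing the at-home quantity
by one minus that share) are assumptions made for this example; the text does not specify the
exact formula used, so this should be read as one plausible reading rather than the procedure
behind the reported figures.

```python
def adjust_household_quantity(purchased, edible_fraction=0.61, outside_share=0.20):
    """Rescale a purchased seafood quantity into an approximate total intake.

    purchased       : quantity bought by the household (any unit)
    edible_fraction : share of the purchase that is edible (61% in the text)
    outside_share   : share of total consumption eaten outside the home
                      (20% in the text)

    ASSUMPTION: purchases only cover at-home consumption, so the total is
    obtained as edible_at_home / (1 - outside_share). This formula is a
    hypothetical choice for illustration, not taken from the paper.
    """
    if not 0.0 < edible_fraction <= 1.0:
        raise ValueError("edible_fraction must be in (0, 1]")
    if not 0.0 <= outside_share < 1.0:
        raise ValueError("outside_share must be in [0, 1)")
    # Keep only the edible part of what was bought.
    edible_at_home = purchased * edible_fraction
    # Scale up so that at-home intake represents (1 - outside_share) of the total.
    return edible_at_home / (1.0 - outside_share)


if __name__ == "__main__":
    # 1 kg of purchased fish under both corrections.
    print(adjust_household_quantity(1.0))
```

Under this reading the two factors partly offset each other: the edible fraction shrinks the
quantity while the outside-the-home share inflates it, which is in line with the intermediate
risk obtained when both corrections are applied together.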



These arguments mainly show the disadvantages of the use of household food acquisition data 
such as the SECODIP database. Nevertheless, they also present many advantages compared to the 
individual food record survey mainly used in France in the food safety context: 

• As mentioned before, households respond for a long period of time (the average is 4 years in 
the SECODIP panel) which allows us to observe long term behaviors and avoid some well 
known biases of individual food record surveys. For example, respondents might over- (under-)
declare certain foods with a good (bad) nutritional value either deliberately or just because
they increased (reduced) their consumption for the short (7 days) period of the survey. 

• The individual surveys are expensive and very difficult to conduct. Highly trained interviewers 
are required and extraordinary cooperation is required from respondents. Household food 
acquisition data can serve many other applications (economics or marketing) and, at least for 
the SECODIP data, acquisition recording is simplified by optical scanning of food barcodes. 



Conclusion 

In this paper, we proposed a methodology to assess chronic risks related to food contamination 
using the example of methylmercury exposure through seafood consumption. This methodology 
includes the definition of a Kinetic Dietary Exposure Model (KDEM) that integrates the fact that 
contaminants are eliminated from the body at different rates, the rate being measured by the half 
life of the contaminant. In this paper, the estimation is based on the use of household food acqui- 
sition data which are first decomposed into individual intake data through a disaggregation model 
accounting for the dependence among household members. Several extensions of this methodology 
are currently being studied. First, the disaggregation model could be improved by adding a
preliminary step that determines which household members are actual consumers, in the spirit
of the Tobit model. The KDEM idea is also being developed further by studying the stability
and ergodic properties of the underlying continuous time piecewise deterministic Markov
process (Bertail et al. 2006). The parameters of this new model are the intake distribution,
the inter-intake time distribution and the dissipation rate distribution. In this framework,
the dissipation parameter η of the KDEM is random and the intake and inter-intake
distributions can be estimated either from individual (INCA-type) data or household
(SECODIP-type) data.
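
To make the role of the half life concrete, the following Python sketch accumulates weekly
intakes while a fixed fraction of the body burden is eliminated each week. It is only a
minimal illustration: the discrete weekly recursion, the function names and the relation
eta = ln(2) / half life are assumptions chosen for this example and need not match the exact
specification of the KDEM used in this paper.

```python
import math


def simulate_body_burden(weekly_intakes, half_life_weeks, initial_burden=0.0):
    """Return the path of cumulative exposure under first-order elimination.

    ASSUMPTION (illustrative, not the paper's exact model): the burden
    follows burden_t = exp(-eta) * burden_{t-1} + intake_t, where
    eta = ln(2) / half_life_weeks is the dissipation rate.
    """
    if half_life_weeks <= 0:
        raise ValueError("half_life_weeks must be positive")
    eta = math.log(2.0) / half_life_weeks
    decay = math.exp(-eta)  # fraction of the burden remaining after one week
    burden = initial_burden
    path = []
    for intake in weekly_intakes:
        burden = decay * burden + intake
        path.append(burden)
    return path


def steady_state_burden(constant_weekly_intake, half_life_weeks):
    """Limit of the recursion above when the same intake is repeated forever."""
    decay = math.exp(-math.log(2.0) / half_life_weeks)
    return constant_weekly_intake / (1.0 - decay)


if __name__ == "__main__":
    # Hypothetical inputs: one unit of intake per week, half life of one week.
    print(simulate_body_burden([1.0, 1.0, 1.0], half_life_weeks=1.0))
    print(steady_state_burden(1.0, half_life_weeks=1.0))
```

With a half life of one week, half of the burden is eliminated between two intakes, so
repeated unit intakes accumulate towards a plateau of two units; a longer half life raises
this plateau, which is why the elimination rate matters for chronic risk.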
Figures and Tables

Table 1: Description of the contamination database (Unit: microgram per kilogram)

                        Mean    Min     Max     Standard Deviation   Number of analyses
Fish                    0.147   0.003   3.520   0.235                1350
Mollusk and Shellfish   0.014   0.001   0.172   0.011                1293

Table 2: Restricted maximum likelihood estimates (REML) for age and all socioeconomic variables
and the p-value of the Student's tests (Pval)

Effect                                                    Parameter   REML     Pval

Income (ref: Mean sup)
  Well to do                                              γ1           6.027   <0.001
  Mean inf                                                γ2           2.686   <0.001
  Modest                                                  γ3          -1.928   <0.001

Region of residence (ref: Non coastal regions)
  North, Brittany, Vendee coast                           γ4           0.962   0.003
  South West coast                                        γ5           5.232   <0.001
  Mediterranean coast                                     γ6           2.303   <0.001
  Paris and its suburbs                                   γ7           1.023   0.009

Occupation category of the principal household earner (ref: Blue collar workers)
  self-employed persons                                   γ8          -0.122   0.771
  white collar workers                                    γ9          -3.733   <0.001
  retirees                                                γ10         -5.261   <0.001
  no activity                                             γ11         -1.910   0.004

Level of education of the principal household earner (ref: BAC and higher degree)
  student                                                 γ12          5.901   <0.001
  no or weak diploma                                      γ13         -1.281   <0.001

Table 3: The different steps performed in testing the socioeconomic part of our model. For each
step, the null hypothesis tested and the p-value (Pval) resulting from the appropriate F-test
are shown. All tests are performed conditionally on the results of the previous tests.

Null hypothesis                     Pval

H1 :      γ8 = 0                    0.771
H2 :      γ9 = γ10                  0.030
H3 :      γ9 = γ11                  0.018
H4 :      γ10 = γ11                 <0.001
H5 :      γ4 = γ5 = γ6 = γ7         <0.001
H6 :  a : γ4 = γ5                   <0.001
      b : γ4 = γ6                   <0.001
      c : γ4 = γ7                   0.881
      d : γ5 = γ6                   <0.001
      e : γ5 = γ7                   <0.001
      f : γ6 = γ7                   0.0103
H7 :      γ12 = γ13                 <0.001
H8 :      γ1 = γ2 = γ3              <0.001
H9 :  a : γ1 = γ2                   <0.001
      b : γ1 = γ3                   <0.001
      c : γ2 = γ3                   <0.001



Table 4: Restricted maximum likelihood estimates (REML) for all age and socioeconomic variables
of the reduced final model with all variance components and their standard errors (s.e)

Effect                                                    Parameter   REML     Pval

Income (ref: Mean sup)
  Well to do                                              γ1           6.108   <0.001
  Mean inf                                                γ2           2.760   <0.001
  Modest                                                  γ3          -1.915   <0.001

Region of residence (ref: Non coastal regions)
  Paris and North, Brittany, Vendee coast                 γ4 = γ7      0.995   <0.001
  South West coast                                        γ5           5.156   <0.001
  Mediterranean coast                                     γ6           2.250   <0.001

Occupation category of the principal household earner
(ref: Blue collar workers and self-employed persons)
  white collar workers                                    γ9          -3.745   <0.001
  retirees                                                γ10         -5.243   <0.001
  no activity                                             γ11         -1.871   0.005

Level of education of the principal household earner (ref: BAC and higher degree)
  student                                                 γ12          5.879   <0.001
  no or weak diploma                                      γ13         -1.279   <0.001

                                                          Parameter   REML      s.e

Variance of the random effect                                         24.832    6.7316
Variance-covariance structure
  variance                                                            1260705   282309
  correlation                                             ρ           -0.22     0.0434

Figure 1: Cumulative exposure to MeHg (unit: μg per kg of body weight)






